A Voice Dictation System for a Million-Word Czech Vocabulary
Authors
Abstract
The paper describes a set of techniques developed for discrete dictation with a vocabulary of up to a million entries, a size that is one of the main challenges posed by highly inflected languages like Czech. We present our approach to building an efficiently coded tree lexicon with suffix sub-trees and morphological classification. Acoustic modeling is based on monophone, diphone, or triphone models. Lexical and grammatical constraints are represented by unigrams and bigrams, where the latter take the form of a binary matrix describing grammatical admissibility between word categories. The complete system has been evaluated on a 792,338-word lexicon. Its real-time implementation yields a word error rate below 12%.
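As a rough illustration of two ideas named in the abstract, the sketch below pairs a plain prefix-tree lexicon with a binary matrix of admissible category pairs. It is not the authors' implementation: the class and function names, the three word categories, and the matrix values are all invented for this example, and the suffix sub-tree sharing mentioned in the abstract is not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical word categories (the paper's actual category set is not given here).
NOUN, ADJ, VERB = 0, 1, 2

# Toy binary matrix: ADMISSIBLE[a][b] == 1 means a word of category b
# may follow a word of category a. The values are illustrative only.
ADMISSIBLE = [
    [1, 0, 1],  # after NOUN
    [1, 1, 0],  # after ADJ
    [1, 1, 0],  # after VERB
]


@dataclass
class Node:
    children: Dict[str, "Node"] = field(default_factory=dict)
    category: Optional[int] = None  # set only on nodes that end a word


class TreeLexicon:
    """Plain prefix tree; words sharing a prefix share the same nodes."""

    def __init__(self) -> None:
        self.root = Node()

    def add(self, word: str, category: int) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, Node())
        node.category = category

    def lookup(self, word: str) -> Optional[int]:
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.category


def admissible(lexicon: TreeLexicon, prev_word: str, next_word: str) -> bool:
    """Return True if both words are known and their category pair is allowed."""
    a = lexicon.lookup(prev_word)
    b = lexicon.lookup(next_word)
    if a is None or b is None:
        return False
    return bool(ADMISSIBLE[a][b])


def build_demo_lexicon() -> TreeLexicon:
    lexicon = TreeLexicon()
    lexicon.add("nový", ADJ)    # "new"
    lexicon.add("dům", NOUN)    # "house"
    lexicon.add("domy", NOUN)   # "houses"
    lexicon.add("stojí", VERB)  # "stands"
    return lexicon
```

With this toy data, `admissible(build_demo_lexicon(), "nový", "dům")` is true, while the pair `"nový"`, `"stojí"` is rejected. A binary matrix of this kind needs only one bit per category pair, so its size depends on the number of categories rather than on the number of words.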
Related resources
Very large vocabulary voice dictation for mobile devices
This paper deals with optimization techniques that can make very large vocabulary voice dictation applications deployable on recent mobile devices. We focus in particular on optimization of signal parameterization (frame rate, FFT calculation, fixed-point representation) and on efficient pruning techniques employed at the state and Gaussian mixture level. We demonstrate the applicability of the propose...
Study on Cross-Lingual Adaptation of a Czech LVCSR System towards Slovak
This paper deals with cross-lingual adaptation of a Large Vocabulary Continuous Speech Recognition (LVCSR) system between two similar Slavic languages – from Czech to Slovak. The proposed adaptation scheme is performed in two consecutive phases and it is focused on acoustic modeling and phoneme and pronunciation mapping. It also utilizes language similarities between the source and the target l...
MAP Based Speaker Adaptation in Very Large Vocabulary Speech Recognition of Czech
The paper deals with the problem of efficient adaptation of speech recognition systems to individual users. The goal is to achieve better performance in specific applications where one known speaker is expected. In our approach we adopt the MAP (Maximum A Posteriori) method for this purpose. The MAP based formulae for the adaptation of the HMM (Hidden Markov Model) parameters are described. Sev...
Design and development of voice controlled aids for motor-handicapped persons
In this paper we present two voice-operated systems that have been designed for Czech motor-handicapped people to allow them full access to computers and computer based services. The programs, which are named MyVoice and MyDictate, are complementary in their functions. Both employ ASR engines developed in our lab. The former is used primarily as a midsize-vocabulary (up to 10K words) voice comm...
EasyCmd: Navigation by Voice Commands
In this paper we present a system named EasyCmd that provides voice navigation on the desktop of Microsoft Windows 9x systems. The speech recognition engine for EasyCmd is very similar to that of a dictation machine. The Statistical Knowledge Based Frame Synchronous Search algorithm (SKBFSS) and Word Search Tree (WST) technologies are applied for acoustic decoding. Recognition Score Gap (RSG) is used for ...
Publication date: 2004